Query Translation for Cross-Language Information Retrieval using Multilingual Word Clusters
Abstract
In Cross-Language Information Retrieval, finding an appropriate translation of the source-language query has always been a difficult problem. We propose a technique that addresses this problem using multilingual word clusters obtained from multilingual word embeddings. We project the word embeddings of the languages into a common vector space and apply a community-detection algorithm there to find clusters such that words from different languages that represent the same concept fall into the same group. We use these multilingual word clusters to perform query translation for Cross-Language Information Retrieval across three languages: English, Hindi, and Bengali. We experimented with the FIRE 2012 and Wikipedia datasets and show improvements over several standard methods, including a dictionary-based method, a transliteration-based model, and Google Translate.
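The pipeline described in the abstract (shared embedding space, clustering, cluster-based query translation) might be sketched roughly as follows. This is only an illustration under loud assumptions: the toy vectors, the `threshold` value, and the function names `cluster_words` and `translate_query` are all hypothetical, and connected components over a thresholded cosine-similarity graph stand in for the community-detection algorithm, which the abstract does not name.

```python
import math
from collections import defaultdict

# Hypothetical toy vectors, assumed to be ALREADY projected into one shared space.
# Real multilingual embeddings would have hundreds of dimensions.
EMBEDDINGS = {
    ("en", "water"): [0.90, 0.10, 0.00],
    ("hi", "पानी"): [0.88, 0.12, 0.02],
    ("bn", "জল"): [0.91, 0.08, 0.01],
    ("en", "book"): [0.05, 0.95, 0.10],
    ("hi", "किताब"): [0.07, 0.93, 0.12],
    ("bn", "বই"): [0.04, 0.96, 0.09],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def cluster_words(embeddings, threshold=0.95):
    """Group words whose vectors are similar.

    Stand-in for community detection (an assumption): link two words when
    their cosine similarity reaches `threshold`, then take connected
    components of that graph using a small union-find.
    """
    words = list(embeddings)
    parent = {w: w for w in words}

    def find(w):
        while parent[w] != w:
            parent[w] = parent[parent[w]]  # path halving
            w = parent[w]
        return w

    for i, a in enumerate(words):
        for b in words[i + 1:]:
            if cosine(embeddings[a], embeddings[b]) >= threshold:
                parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for w in words:
        clusters[find(w)].add(w)
    return list(clusters.values())


def translate_query(query_terms, source_lang, target_lang, clusters):
    """Replace each source term with the target-language words in its cluster."""
    translated = []
    for term in query_terms:
        for cluster in clusters:
            if (source_lang, term) in cluster:
                translated.extend(
                    sorted(word for lang, word in cluster if lang == target_lang)
                )
                break
    return translated


clusters = cluster_words(EMBEDDINGS)
hindi_query = translate_query(["water", "book"], "en", "hi", clusters)
bengali_query = translate_query(["water", "book"], "en", "bn", clusters)
```

In this toy setup each cluster holds one word per language, so translation is a direct lookup; with real embeddings a cluster may contain several target-language candidates, and some ranking or weighting step would presumably be needed.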
Related resources
Supporting Multilingual Information Retrieval in Web Applications: An English-Chinese Web Portal Experiment
Cross-language information retrieval (CLIR) and multilingual information retrieval (MLIR) techniques have been widely studied, but they are not often applied to and evaluated for Web applications. In this paper, we present our research in developing and evaluating a multilingual English-Chinese Web portal in the business domain. A dictionary-based approach has been adopted that combines phrasal...
Experiments in Multilingual Information Retrieval
The multilingual information retrieval system of the future will need to be able to retrieve documents across language boundaries. This extension of the classical IR problem is particularly challenging, as significant resources are required to perform query translation. At Xerox, we are working to build a multilingual IR system and conducting a series of experiments to understand what factors ar...
The Effects of the Relevance-Based Superimposition Model in Cross-Language Information Retrieval
We propose a cross-language information retrieval method that is based on document feature modification and query translation using a dictionary extracted from comparable corpora. In this paper, we show the language-independent effectiveness of our document feature modification model for dealing with semantic ambiguity, and demonstrate the practicality of the proposed method for extracting mult...
Cross Lingual Information Retrieval Using Search Engine and Data Mining
With the explosive growth of international users, distributed information and the number of linguistic resources, accessible throughout the World Wide Web, information retrieval has become crucial for users to find, retrieve and understand relevant information, in any language and form. Cross-Language Information Retrieval (CLIR) is a subfield of Information Retrieval which provides a query in ...
Cross Language Information Retrieval Using Multilingual Ontology as Translation Base
This paper reports an experiment to evaluate a Cross Language Information Retrieval (CLIR) system that uses a multilingual ontology to improve query translation in the travel domain. The ontology-based approach significantly outperformed the Machine Readable Dictionary translation baseline using Mean Average Precision as a metric in a user-centered experiment.
Publication date: 2016